ABSTRACT. Measures of dispersion are defined as functionals
satisfying certain equivariance and order conditions. Attention
is restricted to symmetric distributions. Different measures are
compared in terms of asymptotic relative efficiency, i.e., the
inverse ratio of their standardized variances. The efficiency of a
trimmed to the untrimmed standard deviation turns out not to have
a positive lower bound even over the family of Tukey models.
Positive lower bounds for the efficiency (over the family of all
symmetric distributions for which the measures are defined) exist
if the trimmed standard deviations are replaced by pth power
deviations. However, these latter measures are no longer robust,
although for p < 2 they are more robust than the standard
deviation. The results of the paper suggest that a positive bound
to the efficiency may be incompatible with robustness but that
trimmed standard deviations and pth power deviations for p = 1
or 1.5 are quite satisfactory in practice.

1. Measures of dispersion

In analogy with the definition of a measure of location, we
shall define a measure of dispersion to be a functional (defined
over a sufficiently large family of distributions) which satisfies
certain invariance conditions and which in addition has the property
of assigning a larger value to G than to F if G is more
dispersed than F. In the present paper we shall consider the
problem for symmetric distributions* and assume that X is a
random variable whose distribution F is symmetric about $\mu$. It
then seems natural to interpret dispersion in terms of the distance
of X from $\mu$, that is, in terms of the magnitude of $|X-\mu|$,
and to consider Y as more dispersed about $\nu$ than X about $\mu$ if

(1.1)  $|Y-\nu|$ is stochastically larger than $|X-\mu|$.

(This is essentially the "peakedness" ordering introduced by Z. W.
Birnbaum (1948).) Note that

(a) any symmetric random variable is more dispersed than a
constant;

(b) aX is more dispersed than X if a > 1.

If F and G are symmetric about 0 with densities f and g,
a simple sufficient condition for (1.1) with $\mu = \nu = 0$ is that

(1.2)  g(x)/f(x) is increasing for x > 0.

* We expect to take up the asymmetric case in a subsequent paper.

If F and G are symmetric about zero, and G is more dispersed
than F, and if

(1.3)  $H_\theta(x) = \theta G(x) + (1-\theta)F(x)$,

then $H_\theta$ is more dispersed than F for any $0 < \theta < 1$. As an
illustration, note that a standard normal distribution contaminated
with another normal distribution with zero mean and variance > 1
(Tukey model) is more dispersed than the uncontaminated standard
normal distribution.

An important class of examples is provided by the following
result, which is a generalization of a lemma of Birnbaum (1948).

Theorem 1. Let $X_i$, $Y_i$ (i = 1, 2) be independent with
distributions $F_i$, $G_i$ (i = 1, 2) which are symmetric about zero,
and suppose that

(i) $Y_i$ is more dispersed than $X_i$ for i = 1, 2;

(ii) $F_1$ and $G_2$ have unimodal densities and possibly some
probability mass at zero.

Then $Y_1 + Y_2$ is more dispersed than $X_1 + X_2$.

Proof. Consider the probability

$P(|X_1 + X_2| \le c) = 2\int_0^\infty [F_1(x+c) - F_1(x-c)]\,dF_2(x)$.

The unimodality of $F_1$ implies that the integrand on the right-hand
side is a decreasing function of x. From the fact that $(F_2, G_2)$
satisfies (1.1), it then follows that this last integral is decreased
when $F_2$ is replaced by $G_2$. Thus,

$P(|X_1 + X_2| \le c) \ge 2\int_0^\infty [F_1(x+c) - F_1(x-c)]\,dG_2(x)$
$= 2\int_0^\infty [G_2(x+c) - G_2(x-c)]\,dF_1(x)$.

Repeating the argument (this time using the unimodality of $G_2$), we
arrive at the desired result. Birnbaum has shown that Theorem 1 no
longer holds when assumption (ii) is dropped.

Consider now a functional $\tau(F)$ [also denoted by $\tau(X)$ when
X is a random variable with distribution F] defined over a
sufficiently large class of distributions which is closed under
changes of location and scale. We shall require $\tau$ to be
nonnegative and to satisfy

(1.4)  $\tau(aX) = a\,\tau(X)$ for a > 0

(1.5)  $\tau(X+b) = \tau(X)$ for all b.

It follows from (1.5) and the symmetry of F that

(1.6)  $\tau(-X) = \tau(X)$

so that (1.4), with a replaced by |a| on the right-hand side, holds
for all $a \ne 0$.

From (1.4) and (1.5) it is easily seen that

(1.7)  $\tau(c) = 0$ for any constant c.

For by (1.4), we have $\tau(0) = \tau(2 \times 0) = 2\tau(0)$ and hence
$\tau(0) = 0$, and by (1.5), $\tau(c) = \tau(0)$. The converse, that
$\tau(X) = 0$ requires X to be a constant with probability 1, will in
general not hold. An example is provided by the trimmed standard
deviation defined in Section 3 below.

A nonnegative functional $\tau$ satisfying (1.4) and (1.5) will be
called a measure of dispersion if it satisfies in addition

(1.9)  $\tau(F) \le \tau(G)$ whenever G is more dispersed than F.

Note that if $\tau(F)$ is a measure of dispersion, so is $k\,\tau(F)$ for
any k > 0.

A large and important class of dispersion measures is provided
by the functionals

(1.10)  $\tau(F) = \left(\int_0^1 [F_*^{-1}(t)]^\gamma\,d\Lambda(t)\right)^{1/\gamma}$

where F is assumed to be symmetric about $\mu$, $F_*$ denotes the
distribution of $|X - \mu|$, $\Lambda$ is any probability distribution on
(0,1) and $\gamma$ any positive number.

That (1.10) satisfies (1.4) and (1.5) is easily checked; that
it satisfies (1.9) follows from the fact that $F_*^{-1}(t) \le G_*^{-1}(t)$
for all t when $G_*$ is stochastically larger than $F_*$.

A special case of (1.10) is the standard deviation (SD) of F
defined as

(1.11)  $\mathrm{SD}(F) = \left[\int (x-\mu)^2\,dF(x)\right]^{1/2}$.

This is easily seen to be given by (1.10) with $\gamma = 2$ and $\Lambda$ the
uniform distribution on (0,1). The following three important classes
of measures, all special cases of (1.10), provide the alternatives to
the standard deviation with which we shall be concerned.

(i) A generalization $\tau(F; p)$ of the standard deviation
is the pth power deviation obtained by replacing $\gamma$ by p in
(1.10) and letting $\Lambda$ be the uniform distribution on (0,1).

(ii) The doubly trimmed standard deviation $\tau(F; \alpha, \beta)$ is
given by (1.10) with $\gamma = 2$ and $\Lambda$ the uniform distribution on
$(\alpha, 1 - \beta)$. The most important example of this is the case
$\alpha = 0$.

(iii) The $\alpha$th quantile is obtained from (1.10) by letting
$\Lambda$ assign probability 1 to the point $\alpha$. The resulting measure
is independent of $\gamma$.

The standard deviation is of course a member of both (i) and
(ii). The $\alpha$th quantile is the limit of the doubly trimmed
standard deviation as $1 - \beta \to \alpha$.
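
The three classes (i)-(iii) can be illustrated on a finite sample.
The following is only a minimal sketch, assuming a known center mu
and using simple order-statistic conventions for trimming and
quantiles; the function names are hypothetical and the rounding rule
(floor of n times the trimming proportion) is an assumption, not
something fixed by the text.

```python
import math

def abs_deviations(xs, mu=0.0):
    """Sorted absolute deviations |x - mu| (sample analogue of F_*)."""
    return sorted(abs(x - mu) for x in xs)

def pth_power_deviation(xs, p=2.0, mu=0.0):
    """(mean |x - mu|^p)^(1/p); p = 2 gives the SD about a known center."""
    devs = abs_deviations(xs, mu)
    return (sum(d ** p for d in devs) / len(devs)) ** (1.0 / p)

def trimmed_sd(xs, alpha=0.0, beta=0.0, mu=0.0):
    """Root mean square of the |x - mu| that remain after dropping the
    floor(n*alpha) smallest and the floor(n*beta) largest of them."""
    devs = abs_deviations(xs, mu)
    n = len(devs)
    lo, hi = math.floor(n * alpha), n - math.floor(n * beta)
    kept = devs[lo:hi]
    return math.sqrt(sum(d * d for d in kept) / len(kept))

def abs_quantile(xs, alpha, mu=0.0):
    """alpha-th sample quantile of |x - mu| (order-statistic rule)."""
    devs = abs_deviations(xs, mu)
    k = min(len(devs) - 1, max(0, math.ceil(alpha * len(devs)) - 1))
    return devs[k]

sample = [-3.0, -1.5, -0.5, 0.2, 0.4, 1.0, 1.8, 2.5, 4.0, -9.0]
print(pth_power_deviation(sample, p=2.0))   # untrimmed SD about 0
print(trimmed_sd(sample, beta=0.1))         # drops the outlying -9.0
print(abs_quantile(sample, 0.5))            # median absolute deviation
```

With alpha = beta = 0 the trimmed version reduces to the p = 2 power
deviation, and all three functions scale linearly with the data, in
line with (1.4).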

2.  Estimation 


A most important aspect in comparing two measures of scale
$\tau_1(F)$ and $\tau_2(F)$ is the accuracy with which they can be
estimated. Unfortunately, it is no longer possible to compare these
accuracies directly in terms of the asymptotic variances of the
estimators. This is clearly seen by considering the case
$\tau_2 = c\,\tau_1$ where c is any positive constant. If $\tau_1$ is a
possible measure of scale and its estimator is $\delta_1$, one would be
equally happy to use $\tau_2$ and estimate it by $\delta_2 = c\,\delta_1$;
of course the asymptotic variance does not remain the same but gets
multiplied by $c^2$. For this reason a natural measure of accuracy of
an estimator of $\tau(F)$ with asymptotic variance $v^2(F)$ is not
$v^2(F)$ itself but the scale invariant standardized asymptotic
variance (already proposed by Daniell (1920))

(2.1)  $v^2(F)/\tau^2(F)$.

The asymptotic efficiency $e_{2,1}$ of $\delta_2$ (estimating $\tau_2$) to
$\delta_1$ (estimating $\tau_1$) will then be defined as

(2.2)  $e_{2,1}(F) = \dfrac{v_1^2(F)/\tau_1^2(F)}{v_2^2(F)/\tau_2^2(F)}$.

If $\sqrt{n_i}\,(\delta_i - \tau_i)$ is asymptotically normally distributed
for i = 1, 2 as the number $n_i$ of observations tends to infinity, the
usual argument shows that the asymptotic efficiency (2.2) is the
limiting ratio of the numbers of observations required by the two
estimators to achieve the same standardized variance.

That the above definitions are reasonable can be seen from
another point of view. The logarithm of $\delta$ is an estimator of the
location parameter $\log\tau(F)$. Suppose that the distribution of
$\sqrt{n}\,(\delta - \tau)$ tends to the normal distribution with zero mean
and variance $v^2$. Then the distribution of
$\sqrt{n}\,(\log\delta - \log\tau)$ tends to the normal distribution with
mean zero and variance $v^2/\tau^2(F)$; that is, $v^2/\tau^2(F)$ is the
asymptotic variance of the location estimate $\log\delta$.


When studying the estimators of functionals such as those
defined in Section 1, it is convenient first to consider F restricted
to distributions which are symmetric about a known point $\mu$.
On the other hand, in order even to define the estimators of $\tau$
we wish to study, it is necessary to extend $\tau$ to asymmetric
distributions. In all the examples to be considered here, there
is a natural extension of $\tau$ to asymmetric distributions. Given
this extension, we define as estimator of $\tau(F)$ the functional $\tau$
evaluated at the empirical distribution function $\hat F$ of
$X_1, \ldots, X_n$. In what follows, we shall assume without loss of
generality that the known value $\mu$ of the center of symmetry of F
is $\mu = 0$.

The point of view in the present paper will be that the standard
measure of scale is the standard deviation $\tau(F; 0, 0)$ which
(since $\mu = 0$) is estimated by

(2.3)  $\hat\sigma = \sqrt{\sum_{i=1}^n X_i^2 / n}$.


This estimator is well known to be very unsatisfactory because of
its extreme sensitivity to outlying observations. We shall therefore
look at the other functionals under consideration as competitors
of the standard deviation and hence shall be interested principally
in comparing their behavior with that of $\hat\sigma$. Unfortunately,
we are only able to make these comparisons asymptotically. For
$\sum X_i^2/n$, it is of course obvious from the central limit theorem
that $\sqrt{n}\,[\sum X_i^2/n - \sigma^2]$ is asymptotically normal with zero
mean and variance $\mathrm{Var}(X^2)$ provided the latter variance is
finite. Here $\sigma^2 = E(X^2)$ denotes the variance of X.

It follows that $\sqrt{n}\,(\hat\sigma - \sigma)$ is asymptotically normal
with mean zero and variance $\mathrm{Var}(X^2)/4\sigma^2$. Thus the
standardized variance is

(2.4)  $\dfrac{v^2(F)}{\sigma^2} = \dfrac{1}{4}\left[\dfrac{E(X^4)}{\sigma^4} - 1\right]$.

In  the  next  sections  we  shall  obtain  the  corresponding  expansion 
for  the  estimators  of  some  other  functionals,  and  then  study  the 
efficiencies  (2.2). 
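
As a quick worked check of the standardized variance of the SD, one
can evaluate $\frac14[E(X^4)/\sigma^4 - 1]$ for two familiar cases. The
second case assumes the normal scale mixture of (3.7), written here
with contamination fraction $\varepsilon$ and contaminant scale
$\lambda$; these symbols are a notational assumption of this sketch.

```latex
% Normal case: E(X^4) = 3\sigma^4, so the standardized variance is
\frac{1}{4}\left[\frac{3\sigma^4}{\sigma^4} - 1\right] = \frac{1}{2}.

% Mixture (1-\varepsilon)\Phi(x) + \varepsilon\Phi(x/\lambda):
%   \sigma^2 = (1-\varepsilon) + \varepsilon\lambda^2,
%   E(X^4)   = 3[(1-\varepsilon) + \varepsilon\lambda^4],
\frac{1}{4}\left[
  \frac{3\{(1-\varepsilon) + \varepsilon\lambda^4\}}
       {\{(1-\varepsilon) + \varepsilon\lambda^2\}^2} - 1
\right]
\;\longrightarrow\;
\frac{1}{4}\left[\frac{3}{\varepsilon} - 1\right]
\quad\text{as } \lambda \to \infty .
```

The limit is finite for every fixed $\varepsilon > 0$ but grows without
bound as $\varepsilon \to 0$.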

3.  The  doubly  trimmed  standard  deviation 

To replace the standard deviation as a measure of scale, we
shall seek among the measures discussed in Section 1 one which would
be more robust but which still can be estimated fairly efficiently.
Encouraged by the results of BL II, we shall begin this search by
studying the trimmed standard deviations.

As before, let the random variables $X_i$ be independently
distributed according to a distribution F, which is symmetric with
respect to the origin. Let $Y = X^2$ and denote the distribution
of Y by G. We can then write $\tau^2(F; \alpha, \beta)$ (with
$\alpha, \beta < 1/2$) as

(3.1)  $\tau^2(F; \alpha, \beta) = \dfrac{1}{1-\alpha-\beta}\int_\alpha^{1-\beta} G^{-1}(t)\,dt = \dfrac{1}{1-\alpha-\beta}\int_{u_\alpha}^{u_{1-\beta}} y\,dG(y)$

where $u_\alpha$ is the $\alpha$th percentile of G.

Consider now the estimator

(3.2)  $\tau^2(\hat F; \alpha, \beta) \overset{\mathrm{def}}{=} \hat\tau^2(\alpha, \beta)$.


This is obtained by trimming off the $100\alpha\%$ of the observations
with the smallest absolute values and the $100\beta\%$ of the
observations with the largest absolute values and then computing the
standard deviation of the remaining observations. In terms of the
Y's, $\hat\tau^2$ is a doubly trimmed mean. Its expectation is given by
(3.1) and its asymptotic variance is (see for example BL II (1975))

(3.3)  $\dfrac{1}{(1-\alpha-\beta)^2}\left\{\int_{u_\alpha}^{u_{1-\beta}} [y - c(\alpha,\beta)]^2\,dG(y) + \alpha[u_\alpha - c(\alpha,\beta)]^2 + \beta[u_{1-\beta} - c(\alpha,\beta)]^2\right\}$

where

(3.4)  $c(\alpha,\beta) = \int_{u_\alpha}^{u_{1-\beta}} y\,dG(y) + \alpha u_\alpha + \beta u_{1-\beta}$,

(3.5)  $u_t = \left[F^{-1}\left(\dfrac{1+t}{2}\right)\right]^2$.

(This formula holds if $u_\alpha$, $u_{1-\beta}$ are uniquely defined and G is
continuous at $u_\alpha$, $u_{1-\beta}$ (see Stigler (1973)).)


To get an idea of the behavior of such procedures with respect
to the untrimmed standard deviation we consider some representative
cases, namely the (singly) trimmed standard deviations with
$\beta = .1, .2$, $\alpha = 0$. The following tables, computed by Winston
Chow and W. Carmichael, show the efficiencies (2.2) of these
estimators with respect to the standard deviation for the following
three classes of distributions.

(a) t-distributions with 5, 6, 7, 8, 9, 10, 25, 50 degrees
of freedom.

(b) Symmetric beta distributions with densities

(3.6)  $f(x) = [B(r,r)]^{-1}\left(\tfrac14 - x^2\right)^{r-1}$,  $|x| \le \tfrac12$

for r = 1/2, 2, 4, 6, 8, 10, 20, 30.

(c) Tukey normal gross error distributions, i.e.,

(3.7)  $F(x) = (1-\varepsilon)\Phi(x) + \varepsilon\Phi(x/\lambda)$,

for $\varepsilon$ = .025, .075, .10, .20, .40, .50 and $\lambda$ = 2, 4, 6.

These models were selected as representing a range of long and
short tailed distributions and for ease of computation.

The figures suggest that, on the whole, for heavy-tailed
distributions both trimmed SD's are better than the untrimmed SD
and that for the ranges considered $\beta = .1$ is preferable to
$\beta = .2$. For light-tailed distributions such as the Beta
distribution, the untrimmed SD does best, $\beta = .1$ does better than
$\beta = .2$, and $\beta = .1$ performs reasonably well.


  f                5      6      7      8      9     10     25     50   $\infty$

  $\beta = .1$   2.35   1.56   1.29   1.16   1.08   1.03    .85    .81    .78

  $\beta = .2$   2.11   1.36   1.11    .99    .92    .87    .69    .66    .63

Table 3.1: Asymptotic efficiency of $\beta$-trimmed with respect
to untrimmed SD: t-distribution with f
degrees of freedom

  r               .5    1.0    2.0    4.0    6.0    8.0   10.0   20.0   30.0   $\infty$

  $\beta = .1$   1.30    .86    .72    .70    .71    .72    .73    .75    .76    .78

  $\beta = .2$    .83    .57    .52    .54    .55    .57    .57    .60    .61    .63

Table 3.2: Asymptotic efficiency of $\beta$-trimmed with respect
to untrimmed SD: Beta distribution with density (3.6)

                         $\beta = .1$                $\beta = .2$

  $\varepsilon$ \ $\lambda$       2       4       6        2       4       6

    0           .78     .78     .78      .63     .63     .63
   .025         .98    3.97   10.04      .80    3.27    8.36
   .075        1.19    4.01    6.48      .99    3.57    6.03
   .10         1.23    3.48    4.79     1.03    3.27    4.93
   .20         1.21    1.46     .76     1.05    2.04    2.26
   .40          .96     .63     .50      .87     .68     .38
   .50          .87     .63     .55      .79     .52     .36

Table 3.3: Asymptotic efficiency of $\beta$-trimmed with respect to
untrimmed SD: Tukey model (3.7)
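
Entries of this kind can be recomputed from the asymptotic variance
of a trimmed mean of Y = X^2 with alpha = 0. The following is a
minimal sketch, not the program used for the tables: it assumes the
normal scale mixture (1 - eps) Phi(x) + eps Phi(x / lam), uses
closed-form truncated normal moments, and finds the quantile by
bisection. The names `eps`, `lam` and all function names are
hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal

def partial_moments(b):
    """E[Z^2; |Z| < b] and E[Z^4; |Z| < b] for standard normal Z."""
    p, d = STD.cdf(b) - 0.5, STD.pdf(b)
    m2 = 2.0 * (p - b * d)
    m4 = 2.0 * (3.0 * p - 3.0 * b * d - b ** 3 * d)
    return m2, m4

def upper_quantile(prob, eps, lam):
    """Solve (1-eps)*Phi(x) + eps*Phi(x/lam) = prob by bisection."""
    lo, hi = 0.0, 40.0 * max(1.0, lam)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if (1 - eps) * STD.cdf(mid) + eps * STD.cdf(mid / lam) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def trimmed_sd_efficiency(beta, eps=0.0, lam=1.0):
    """Efficiency of the beta-trimmed SD (alpha = 0) relative to the SD."""
    a = upper_quantile(1.0 - beta / 2.0, eps, lam)  # P(|X| <= a) = 1 - beta
    u = a * a                                       # upper trimming point for Y
    m2a, m4a = partial_moments(a)
    m2b, m4b = partial_moments(a / lam)
    ey = (1 - eps) * m2a + eps * lam ** 2 * m2b     # integral of y dG on [0, u]
    ey2 = (1 - eps) * m4a + eps * lam ** 4 * m4b    # integral of y^2 dG on [0, u]
    tau2 = ey / (1 - beta)                          # trimmed functional tau^2
    c = ey + beta * u                               # Winsorized mean of Y
    var_tau2 = (ey2 - 2 * c * ey + c * c * (1 - beta)
                + beta * (u - c) ** 2) / (1 - beta) ** 2
    std_var_trim = var_tau2 / (4.0 * tau2 ** 2)     # delta method: tau^2 -> tau
    s2 = (1 - eps) + eps * lam ** 2
    m4 = 3.0 * ((1 - eps) + eps * lam ** 4)
    std_var_sd = 0.25 * (m4 / s2 ** 2 - 1.0)
    return std_var_sd / std_var_trim

for beta in (0.1, 0.2):
    print(beta, round(trimmed_sd_efficiency(beta), 2),
          round(trimmed_sd_efficiency(beta, eps=0.025, lam=2.0), 2))
```

Under these assumptions the eps = 0 row should come out close to the
normal column of Table 3.1.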


A surprising feature of Table 3.3 is the extremely high efficiency
values for small $\varepsilon > 0$ and large $\lambda$. These seem to arise
in cases where there is enough trimming to insure with very high
probability that only a small proportion of the gross errors is
retained in the trimmed sample. The standardized variance of the
untrimmed SD then rises very sharply with $\lambda$ while that of the
trimmed SD is affected only little as $\lambda$ gets large. The curious
"dip" which occurs in the neighborhood of $\varepsilon = \beta$ is predicted by
the asymptotic theory of the next section.

4.  Nonexistence  of  a lower  bound 

The numerical results of the preceding section are encouraging
and raise the hope that, as in the case of the trimmed means as
measures of location, the efficiencies of the trimmed to the
untrimmed SD have a positive lower bound. Unfortunately, this turns
out not to be the case even if attention is restricted to unimodal
distributions. In fact, even within the class of Tukey models (3.7),
the efficiencies can take on arbitrarily small values.

Theorem 2. Let $e(\beta; \varepsilon, \lambda)$ denote the asymptotic efficiency
of the trimmed standard deviation $\hat\tau(0, \beta)$ relative to the
untrimmed standard deviation, in the Tukey model (3.7). Then

(4.1)  $\lim_{\lambda\to\infty} e(\beta; \beta, \lambda) = 0$.


The proof of this result is based on the following two lemmas.

Lemma 1. For the Tukey model (3.7), the standardized asymptotic
variance of the SD

(4.2)  $\dfrac{1}{4}\left[\dfrac{E(X^4)}{\sigma^4} - 1\right]$

is bounded above uniformly in $\lambda$ for any fixed $\varepsilon$.

Proof. For the model (3.7), the standardized variance (4.2)
becomes

$\dfrac{1}{4}\left[\dfrac{3\{(1-\varepsilon) + \varepsilon\lambda^4\}}{\{(1-\varepsilon) + \varepsilon\lambda^2\}^2} - 1\right]$.

Since this is continuous in $\lambda$ we need only consider its
behavior as $\lambda \to \infty$, when it is clearly bounded.

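Lemma 2. If $\varepsilon = \beta$ in (3.7) and if $u_{1-\beta}$ is given by
(3.5) then as a function of $\lambda$,

$u_{1-\beta} \sim 2\log\lambda$  as $\lambda \to \infty$.

Proof. Let

$x(\lambda) = \sqrt{u_{1-\beta}}$.

Then $x(\lambda)$ satisfies

(4.3)  $(1-\beta)\,\Phi(x(\lambda)) + \beta\,\Phi\!\left(\dfrac{x(\lambda)}{\lambda}\right) = 1 - \dfrac{\beta}{2}$.

First note that as $\lambda \to \infty$ we must have

(4.4)  $x(\lambda) \to \infty$,

(4.5)  $\dfrac{x(\lambda)}{\lambda} \to 0$.

For if (4.4) does not hold there exists a sequence $\lambda_n \to \infty$
such that $x(\lambda_n) \to c < \infty$. Then

$(1-\beta)\,\Phi(x(\lambda_n)) + \beta\,\Phi\!\left(\dfrac{x(\lambda_n)}{\lambda_n}\right) \to (1-\beta)\,\Phi(c) + \dfrac{\beta}{2} < 1 - \dfrac{\beta}{2}$

and this contradicts (4.3). Similarly if (4.4) holds but (4.5) is
violated we could find a sequence $\{\lambda_n\}$ such that

$\lim_n \left[(1-\beta)\,\Phi(x(\lambda_n)) + \beta\,\Phi\!\left(\dfrac{x(\lambda_n)}{\lambda_n}\right)\right] > 1 - \dfrac{\beta}{2}$

and we would again have a contradiction.

Now, by (4.4) and (4.5),

(4.6)  $1 - \Phi[x(\lambda)] = \dfrac{\varphi[x(\lambda)]}{x(\lambda)}\,[1 + o(1)]$

and

(4.7)  $\Phi\!\left(\dfrac{x(\lambda)}{\lambda}\right) - \dfrac12 = \dfrac{x(\lambda)}{\lambda}\,\varphi(0)\,[1 + o(1)]$.

Substituting (4.6) and (4.7) in (4.3), cancelling, and then taking
logs we obtain

(4.8)  $-\dfrac{x^2(\lambda)}{2} - \log x(\lambda) = \log x(\lambda) - \log\lambda + \log\dfrac{\beta}{1-\beta} + o(1)$.

By (4.4),

$\log x(\lambda) = o(x^2(\lambda))$

and we conclude that

$x^2(\lambda)\,(1 + o(1)) = 2\log\lambda$

which was to be proved.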
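Proof of Theorem 2. In view of Lemma 1 we only need to show
that

(4.9)  $v^2_{0,\beta} \to \infty$

while

(4.10)  $\tau^2(F; 0, \beta) = O(1)$  as $\lambda \to \infty$.

To show (4.10) we calculate

$(1-\beta)\,\tau^2(F; 0, \beta) = (1-\beta)\int_{-x(\lambda)}^{x(\lambda)} t^2\varphi(t)\,dt + \dfrac{\beta}{\lambda}\int_{-x(\lambda)}^{x(\lambda)} t^2\varphi\!\left(\dfrac{t}{\lambda}\right)dt$

$\le (1-\beta) + 2\beta\lambda^2\int_0^{x(\lambda)/\lambda} y^2\varphi(y)\,dy$

$= (1-\beta) + o(1)$

and the result follows. This argument and Lemma 2 show that

(4.11)  $c(0, \beta) = (1-\beta)\,\tau^2(F; 0, \beta) + \beta u_{1-\beta} \sim \beta u_{1-\beta}$

where $c(\alpha, \beta)$ is given by (3.4).

Finally, from (4.11),

$v^2_{0,\beta} \ge \beta\,(u_{1-\beta} - c(0, \beta))^2 \sim \beta(1-\beta)^2 u^2_{1-\beta}$

which tends to $\infty$, and (4.9) and the theorem follow.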

Remark 1. The same arguments show that the result of Theorem 2
continues to hold if $\tau^2(\hat F; 0, \beta)$ is replaced by
$\tau^2(\hat F; \alpha, \beta)$ for any $\alpha < 1 - \beta$, and if as before
$\varepsilon = \beta$.

Remark 2. In addition to predicting that the efficiency of the
$\beta$-trimmed SD to the SD tends to 0 as $\lambda \to \infty$ for
$\varepsilon = \beta$, the asymptotic theory also gives a positive limit for
$\varepsilon \ne \beta$. Roughly speaking, for $\varepsilon > \beta$ the behavior is
governed by the contaminant while for $\varepsilon < \beta$ it is governed by
the main portion. Clearly Table 3.3 reflects this only very crudely.
For the values of $\lambda$ we have given, the efficiency at
$\varepsilon = \beta$ remains quite large. However, further numerical
calculations for extremely large values of $\lambda$ (of the order of
50 and above) confirm the asymptotic theory although convergence is
slow. It is seen from the tables that small efficiency values do
appear for smaller values of $\lambda$ and $\varepsilon > \beta$. These
presumably reflect small positive limits which are approached more
rapidly than the asymptotic value for $\varepsilon = \beta$.

Remark 3. An interesting limiting case of the doubly trimmed
standard deviations arises as both $\alpha$ and $\beta$ tend to 1/2. The
limiting functional is naturally taken to be

(4.12)  $\tau(F; \tfrac12, \tfrac12) = \sqrt{G^{-1}(\tfrac12)}$,

the median of the distribution of $|X - \mu|$ which, since X is
symmetric about $\mu$, coincides with

$\tfrac12\left[F^{-1}(\tfrac34) - F^{-1}(\tfrac14)\right]$,

half the interquartile range of F. The estimator
$\tau(\hat F; \tfrac12, \tfrac12)$ has asymptotic standardized variance

(4.13)  $\dfrac{1}{16\,x^2 f^2(x)}$

where F(x) = 3/4.

Again for the Tukey model with $\varepsilon = .5$ the asymptotic
efficiency of this estimator relative to the SD tends to zero as
$\lambda \to \infty$. Numerical values of the efficiency, similar to those
shown in Table 3.3 and computed by Winston Chow, are as follows.


  $\varepsilon$ \ $\lambda$      2       4       6

    0          .37     .37     .37
   .1          .62    2.05    3.16
   .2          .65    1.43    1.78
   .3          .61    1.01    1.11
   .4          .56     .73     .70
   .5          .51     .52     .43

Asymptotic efficiency of $\tau(\hat F; \tfrac12, \tfrac12)$
with respect to the SD for the
Tukey model (3.7).


They are clearly much less satisfactory than the corresponding
values for $\beta = .1$. In particular, the low values at the
uncontaminated normal make this measure unsuitable.
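
The efficiencies of the median absolute deviation can be recomputed
from the standardized variance 1/(16 x^2 f(x)^2) at F(x) = 3/4. The
sketch below is only illustrative: it assumes the normal scale
mixture (1 - eps) Phi(x) + eps Phi(x / lam), and the names `eps`,
`lam` and `median_abs_efficiency` are hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal

def mixture_cdf(x, eps, lam):
    return (1 - eps) * STD.cdf(x) + eps * STD.cdf(x / lam)

def mixture_pdf(x, eps, lam):
    return (1 - eps) * STD.pdf(x) + eps * STD.pdf(x / lam) / lam

def median_abs_efficiency(eps=0.0, lam=1.0):
    """Efficiency of the median of |X| relative to the SD."""
    lo, hi = 0.0, 40.0 * max(1.0, lam)
    for _ in range(200):                 # bisection for F(x) = 3/4
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, eps, lam) < 0.75:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    std_var_median = 1.0 / (16.0 * x * x * mixture_pdf(x, eps, lam) ** 2)
    s2 = (1 - eps) + eps * lam ** 2      # variance of the mixture
    m4 = 3.0 * ((1 - eps) + eps * lam ** 4)
    std_var_sd = 0.25 * (m4 / s2 ** 2 - 1.0)
    return std_var_sd / std_var_median

print(round(median_abs_efficiency(), 2))          # uncontaminated normal
print(round(median_abs_efficiency(0.1, 2.0), 2))  # eps = .1, lam = 2
```

At the uncontaminated normal this gives a value near .37, which is
the low efficiency the text refers to.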

The result of Theorem 2 is rather disappointing, particularly
since it is in such contrast to the general (asymmetric) location
case. One may ask whether a positive lower bound can be obtained
if the trimmed SD is replaced by a measure of the form (1.10) with
$\gamma = 2$ but more general $\Lambda$. An essentially negative answer (if
attention is restricted to robust measures) is provided by the
following generalisation of Theorem 2.

Theorem 3. Suppose that $\tau(F)$ is given by (1.10) where the
weight function $\Lambda$ has a density $\Lambda'$ which vanishes outside
$[\alpha, 1 - \beta]$ with $0 < \alpha < 1 - \beta < 1$ and is bounded away
from 0 inside. Then the efficiency of $\tau^2(\hat F)$ relative to the
standard deviation tends to 0 as $\lambda \to \infty$ in the Tukey model
(3.7) with $\varepsilon = \beta$.

For the proof we require the following lemma.

Lemma 3. If $\tau_1^2$ and $\tau_2^2$ are two functionals of the form (1.10)
with $\gamma = 2$ and weight functions $\Lambda_1$, $\Lambda_2$, and if the
$\Lambda_i$ have densities $\Lambda_i'$ satisfying

(4.14)  $0 < a \le \Lambda_2'(t)/\Lambda_1'(t) \le A < \infty$,

then the efficiency of the estimator of $\tau_2$ to that of $\tau_1$
satisfies

(4.15)  $e_{2,1}(F) \ge a^2/A^2$.

Proof. Condition (4.14) clearly implies that
$\tau_2^2(F) \ge a\,\tau_1^2(F)$ for all F. On the other hand, it follows
from Theorem 3 of BL II that $v_2^2(F) \le A^2 v_1^2(F)$, and the
conclusion follows.

The theorem now follows on taking $\tau_1$ and $\tau_2$ as
$\tau(F; \alpha, \beta)$ and the SD respectively and applying Remark 1
following Theorem 2.

5.  The pth power deviations

Although we have not found a measure of dispersion which is
robust in the sense of BL (I, II) and whose efficiency relative to
the SD has a positive lower bound, it is possible to find measures
which are more robust (in the sense of BL II) than the SD and which
achieve such a lower bound. In fact, restrict attention to
distributions with finite pth moment and consider the pth absolute
power deviation

(5.1)  $\tau_{(p)} = (E|X - \mu|^p)^{1/p}$

for $1 \le p \le 2$, and the associated estimator

(5.2)  $\hat\tau_{(p)} = \left(\sum |X_i|^p / n\right)^{1/p}$,

where from now on we shall assume wlog that $\mu = 0$.

Below we shall obtain lower bounds for the efficiency of (5.2)
relative to the SD for the family $\mathcal F_0$ of all symmetric
distributions. This bound can be improved if we restrict attention
to the family $\mathcal F_1$ of symmetric unimodal distributions and
improved still further for the family $\mathcal F_2$ of scale mixtures
of normal distributions with a common mean.


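Theorem 4. Let e(F; p, q) denote the efficiency of the pth
relative to the qth absolute power deviation where $1 \le p < q$.
Then

(5.3)  $\inf_{F \in \mathcal F_i} e(F; p, q) =$

  $\dfrac{p^2}{q^2}$   for i = 0,

  $\dfrac{p^2}{q^2}\cdot\dfrac{(2p+1)(q+1)^2}{(2q+1)(p+1)^2}$   for i = 1,

  $\dfrac{p^2}{q^2}\cdot\dfrac{\Gamma(q+\frac12)\,\Gamma^2\!\left(\frac{p+1}{2}\right)}{\Gamma(p+\frac12)\,\Gamma^2\!\left(\frac{q+1}{2}\right)}$   for i = 2.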


Before proving these results, let us note that for p = 1, q = 2
we obtain for the efficiency of the absolute deviation to the standard
deviation the values

  1/4 = .250        for i = 0
  27/80 = .337      for i = 1
  $1.5/\pi$ = .477    for i = 2

These bounds are rather low and can be improved (at the cost
of some robustness) by taking a larger value of p. For example,
if p = 1.5, the values are

  .563   for i = 0
  .648   for i = 1
  .851   for i = 2
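
The bounds for p = 1, q = 2 can be recomputed from absolute moments.
This is a hedged sketch: it assumes the bound has the form
(p^2/q^2) nu_{2q} nu_p^2 / (nu_q^2 nu_{2p}) with nu_a = E|Z|^a, where
Z is uniform on (-1, 1) for the unimodal family and standard normal
for the normal scale mixtures; the function names are hypothetical.

```python
import math

def nu_uniform(a):
    """E|Z|^a for Z uniform on (-1, 1)."""
    return 1.0 / (a + 1.0)

def nu_normal(a):
    """E|Z|^a for Z standard normal."""
    return 2.0 ** (a / 2.0) * math.gamma((a + 1.0) / 2.0) / math.sqrt(math.pi)

def lower_bound(p, q, nu=None):
    """Infimum of the efficiency of the pth to the qth power deviation.

    nu=None is the family of all symmetric distributions; otherwise nu
    is the absolute-moment function of the fixed factor Z.
    """
    base = (p / q) ** 2
    if nu is None:
        return base
    return base * nu(2 * q) * nu(p) ** 2 / (nu(q) ** 2 * nu(2 * p))

for label, nu in (("all symmetric", None),
                  ("unimodal", nu_uniform),
                  ("normal mixtures", nu_normal)):
    print(label, lower_bound(1.0, 2.0, nu))
```

For p = 1, q = 2 the three values are 1/4, 27/80 and 3/(2 pi), which
agree with the values quoted for the mean deviation.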

The proof of all three parts of the theorem hinges on the
following lemma, which is equivalent to the case i = 0.

Lemma 4. Let V be any nonnegative random variable and

(5.4)  $\mu_\alpha = E(V^\alpha)$.

Then, if $1 \le \alpha \le \beta$,

(5.5)  $\dfrac{\mu_{2\alpha}}{\mu_\alpha^2} \le \dfrac{\mu_{2\beta}}{\mu_\beta^2}$

with equality if and only if V is a positive constant with
probability 1.

Proof. To see this, note that $\log\mu_\alpha$ is a convex function
of $\alpha$ (see for example Loève (1955), p. 155). From this it follows
easily that

$\dfrac{\log\mu_{2\beta} - \log\mu_{2\alpha}}{2(\beta - \alpha)} \ge \dfrac{\log\mu_\beta - \log\mu_\alpha}{\beta - \alpha}$

and hence that $\mu_{2\alpha}/\mu_\alpha^2 \le \mu_{2\beta}/\mu_\beta^2$, as was to
be proved.

Proof of part (i) of Theorem 4. By the central limit theorem
we find easily that if the required moments exist, we have

(5.6)  $e(F; p, q) = \dfrac{p^2}{q^2}\cdot\dfrac{E|X|^{2q}/E^2|X|^q - 1}{E|X|^{2p}/E^2|X|^p - 1}$.

The result then follows immediately from the lemma.
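
The step from the central limit theorem to the efficiency formula
can be spelled out as follows. This is a sketch of the usual delta
method argument, with $m_p = E|X|^p$ introduced here only as
shorthand.

```latex
% CLT for the sample pth absolute moment:
\sqrt{n}\Bigl(\tfrac{1}{n}\textstyle\sum |X_i|^p - m_p\Bigr)
  \to N\bigl(0,\; m_{2p} - m_p^2\bigr), \qquad m_p = E|X|^p .

% Delta method with g(m) = m^{1/p}, g'(m_p) = \tfrac{1}{p} m_p^{1/p - 1}:
v_{(p)}^2 = \frac{1}{p^2}\, m_p^{2/p - 2}\,\bigl(m_{2p} - m_p^2\bigr).

% Standardizing by \tau_{(p)}^2 = m_p^{2/p}:
\frac{v_{(p)}^2}{\tau_{(p)}^2}
  = \frac{1}{p^2}\left[\frac{m_{2p}}{m_p^2} - 1\right].

% Efficiency of the pth to the qth power deviation (inverse ratio):
e(F; p, q)
  = \frac{p^2}{q^2}\cdot
    \frac{m_{2q}/m_q^2 - 1}{m_{2p}/m_p^2 - 1}.
```

For p = 2 the standardized variance reduces to
$\frac14[E(X^4)/\sigma^4 - 1]$, in agreement with (2.4).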

Proof of parts (ii) and (iii) of Theorem 4. Note first that
both of these families are of the following type:

(5.7)  $\mathcal F$ = {F: X ~ F, X = ZW, Z is fixed and W is
independent of Z and varies freely}.

In case (ii) this is achieved by taking Z to be the uniform
distribution on (-1, 1) (see Feller (1966), p. 155). In case (iii)
we just take Z to be a standard normal variable. To complete the
proof we need the following lemma.

Lemma 5. Let $\mathcal F$ be as in (5.7) with Z symmetric about 0.
Let

(5.8)  $\nu_\alpha = E|Z|^\alpha$.

Then for $1 \le p < q$,

(5.9)  $\inf_{\mathcal F} e(F; p, q) = \dfrac{p^2}{q^2}\cdot\dfrac{\nu_{2q}\,\nu_p^2}{\nu_q^2\,\nu_{2p}}$.


Proof. Let

$w_\alpha = \dfrac{E|W|^{2\alpha}}{E^2|W|^\alpha}$.

Then from (5.6) and (5.7)

$e(F; p, q) = \dfrac{p^2}{q^2}\cdot\dfrac{(\nu_{2q}/\nu_q^2)\,w_q - 1}{(\nu_{2p}/\nu_p^2)\,w_p - 1} = \dfrac{p^2}{q^2}\cdot\dfrac{(\nu_{2q}/\nu_q^2)\,\rho - (1/w_p)}{(\nu_{2p}/\nu_p^2) - (1/w_p)}$

where $\rho = w_q/w_p$ and where X = ZW has distribution F. Keeping
$w_p$ fixed, this is minimized by taking for $\rho$ its minimum value
which is 1 by Lemma 4. Since by Lemma 4,
$\nu_{2q}/\nu_q^2 \ge \nu_{2p}/\nu_p^2$, we now obtain a lower bound for the
efficiency by letting $w_p \to \infty$. To show that this lower bound is
sharp we need to exhibit a sequence of distributions for W such that
$\rho \to 1$ and $w_p \to \infty$. Take

(5.10)  W = 1 with probability $\pi$
          = a with probability $1 - \pi$.

Then

$w_\alpha = \dfrac{\pi + (1-\pi)\,a^{2\alpha}}{(\pi + (1-\pi)\,a^\alpha)^2}$.

As $a \to \infty$,

$w_p \to \dfrac{1}{1-\pi}$  and  $\rho \to 1$.

Now letting $\pi \to 1$ we can extract the requisite sequence of
distributions. The lemma follows.

The theorem now follows from the standard formulas for the
absolute moments of uniform and standard normal variables.

Remark. Note that for case (iii), as for the trimmed standard
deviations, an extremal sequence is provided by an appropriate
sequence of Tukey models. In fact, the distributions defined by
(5.10) are of this type.
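
The extremal two-point construction can be checked numerically for
normal Z, p = 1 and q = 2. The sketch below is illustrative only: it
assumes X = ZW with W equal to 1 with probability `prob_one` and to
`a` otherwise, and evaluates the efficiency in closed form; the
function and parameter names are hypothetical.

```python
import math

def nu_normal(a):
    """E|Z|^a for Z standard normal."""
    return 2.0 ** (a / 2.0) * math.gamma((a + 1.0) / 2.0) / math.sqrt(math.pi)

def two_point_efficiency(p, q, prob_one, a):
    """Efficiency of the pth to the qth power deviation when X = Z*W,
    Z standard normal, W = 1 w.p. prob_one and W = a otherwise."""
    def w(alpha):
        # E|W|^(2 alpha) / (E|W|^alpha)^2 for the two-point W
        num = prob_one + (1 - prob_one) * a ** (2 * alpha)
        den = (prob_one + (1 - prob_one) * a ** alpha) ** 2
        return num / den
    def k(alpha):
        return nu_normal(2 * alpha) / nu_normal(alpha) ** 2
    return (p / q) ** 2 * (k(q) * w(q) - 1.0) / (k(p) * w(p) - 1.0)

bound = 1.5 / math.pi                      # infimum for p = 1, q = 2
print(two_point_efficiency(1.0, 2.0, 1.0, 5.0))     # pure normal
print(two_point_efficiency(1.0, 2.0, 0.999, 1e6))   # near-extremal mixture
print(bound)
```

With no contamination the value is the familiar normal efficiency of
the mean deviation (about .88); with `prob_one` close to 1 and a very
large `a` it approaches the bound from above.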

As we have noted, the lower bound to the efficiency of the
mean deviation is rather low even for $\mathcal F_2$. However, as we
found for trimmed standard deviations, in reasonable situations the
bound is very conservative. Here are some numerical results for
the pth power deviation for p = 1, 1.5 and a selection of the
distributions given in Tables 3.1-3.3.


  f            5     10     25     50   $\infty$

  p = 1      2.35   1.12    .94    .91    .88

  p = 1.5    1.88   1.12   1.01    .99    .97

Table 5.1. Asymptotic efficiency of pth power
deviations with respect to standard
deviation.

t-distribution with f degrees of freedom.


  r           .5    1.0    2.0     4      6      8     10     20     30   $\infty$

  p = 1      .53    .60    .68    .75    .78    .80    .81    .84    .85    .88

  p = 1.5    .76    .80    .85    .89    .92    .93    .94    .96    .97    .97

Table 5.2. Asymptotic efficiency of pth power deviation
with respect to SD.

Beta distribution with density (3.6)


                          p = 1                     p = 1.5

  $\varepsilon$ \ $\lambda$       2       4       6        2       4       6

    0           .88     .88     .88      .97     .97     .97
   .025        1.06    3.08    5.18     1.10    1.99    2.33
   .075        1.22    2.53    2.66     1.18    1.61    1.51
   .10         1.25    2.21    2.14     1.19    1.48    1.35
   .20         1.24    1.50    1.31     1.17    1.20    1.09
   .40         1.09    1.04     .92     1.08    1.02     .96
   .50         1.03     .95     .85     1.05     .99     .94

Table 5.3. Asymptotic efficiency of pth power
deviation with respect to standard deviation.
Tukey model (3.7).
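
For the pth power deviations the efficiencies under the Tukey model
have a closed form, because the absolute moments of a normal scale
mixture are explicit. The following is a minimal sketch under that
assumption; `eps` and `lam` denote the contamination fraction and
contaminant scale, and the function names are hypothetical.

```python
import math

def nu_normal(a):
    """E|Z|^a for Z standard normal."""
    return 2.0 ** (a / 2.0) * math.gamma((a + 1.0) / 2.0) / math.sqrt(math.pi)

def mixture_abs_moment(a, eps, lam):
    """E|X|^a for X ~ (1-eps) N(0,1) + eps N(0, lam^2)."""
    return nu_normal(a) * ((1 - eps) + eps * lam ** a)

def standardized_variance(p, eps, lam):
    """(1/p^2) [E|X|^(2p) / (E|X|^p)^2 - 1] for the pth power deviation."""
    m_p = mixture_abs_moment(p, eps, lam)
    m_2p = mixture_abs_moment(2 * p, eps, lam)
    return (m_2p / m_p ** 2 - 1.0) / p ** 2

def efficiency_vs_sd(p, eps=0.0, lam=1.0):
    """Efficiency of the pth power deviation relative to the SD (p = 2)."""
    return standardized_variance(2.0, eps, lam) / standardized_variance(p, eps, lam)

for p in (1.0, 1.5):
    print(p, round(efficiency_vs_sd(p), 2),
          round(efficiency_vs_sd(p, eps=0.025, lam=2.0), 2))
```

The eps = 0 values give the normal efficiencies, about .88 for the
mean deviation and .97 for p = 1.5.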


Qualitatively the behavior of these measures closely parallels
that of the corresponding Tables 3.1-3.3. For reasonable
distributions the mean deviation particularly seems to do even better
than the trimmed deviations. Of course, for sufficiently heavy
tailed distributions, for example t-distributions with sufficiently
low degrees of freedom, it can break down badly.

6.  Measuring  the  scale  of  positive  random  variables 

The concepts and results developed so far also apply to a
somewhat different problem. Consider a random variable X with
distribution F, which is known to be positive. Then one may
be interested in scaling this distribution by defining a suitable
measure $\sigma$ of its distance from zero. Of such a measure we shall
require (in analogy with the earlier axioms for $\tau$)

(6.1)  $\sigma(aX) = a\,\sigma(X)$ for a > 0

and

(6.2)  $\sigma(Y) \ge \sigma(X)$ if Y is stochastically larger than X.

There is a simple correspondence between such measures and
the earlier measures $\tau$, defined by

(6.3)  $\tau(X) = \sigma(|X - \mu|)$

where $\mu$, as before, denotes the center of symmetry of X.
Given a measure $\sigma$ defined over positive random variables and
satisfying (6.1) and (6.2), let X be a symmetric variable whose
center of symmetry is denoted by $\mu$. Then (6.3) defines a measure
satisfying (1.4), (1.5) and (1.9). Conversely, let $\tau$ be a
measure of dispersion defined over symmetric random variables,
and let Z be any positive random variable. Extend Z to negative
values so that it is symmetric about 0 and denote the
resulting random variable by X. Then $\mu = 0$ and
$|X - \mu| = |X| = Z$. The measure $\sigma(Z)$ defined through (6.3)
satisfies (6.1) and (6.2).

Examples of scale measures $\sigma$ are provided by the median
of a positive random variable X, by the first moment $E(X)$, by
the square root of the second moment $\sqrt{E(X^2)}$, or by trimmed
versions of these latter measures.  Using (6.3), it is a trivial
matter to adapt the results of Sections 3 and 4 to comparisons of
the estimator of $\sqrt{E(X^2)}$ with its trimmed versions.  This is,
however, not quite appropriate since the scale measures of greater
interest for positive random variables are those corresponding
to $E(X)$ and its trimmed versions.
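
As an illustration of (6.1) and (6.3), here is a minimal sketch
applied to a finite sample in place of a distribution.  The function
names are hypothetical, and the three scale measures are simply the
median, the first moment and the root of the second moment mentioned
above.

```python
import statistics


def sigma_median(z):
    """Median of a sample of nonnegative values."""
    return statistics.median(z)


def sigma_mean(z):
    """First moment of a sample of nonnegative values."""
    return statistics.fmean(z)


def sigma_rms(z):
    """Square root of the second moment of a sample."""
    return statistics.fmean([v * v for v in z]) ** 0.5


def dispersion(x, mu, sigma):
    """tau(X) = sigma(|X - mu|), i.e. (6.3) applied to a sample."""
    return sigma([abs(v - mu) for v in x])


sample = [-3.0, -1.0, 0.0, 1.0, 3.0]   # symmetric about mu = 0
scaled = [2.5 * v for v in sample]      # the sample of aX with a = 2.5
for sigma in (sigma_median, sigma_mean, sigma_rms):
    print(sigma.__name__, dispersion(sample, 0.0, sigma),
          dispersion(scaled, 0.0, sigma))
```

For each of the three measures the value for the scaled sample is
2.5 times the value for the original sample, as (6.1) requires.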

We believe that the results of numerical and theoretical
comparisons of the sample mean and its trimmed competitors
qualitatively will be quite similar to those obtained for the
sample SD and its trimmed competitors, but we have not carried
out this program.  In addition to this, Theorem 4 of Section 5
reveals that the sample mean, which is the estimator of $E(X)$,
has for $p > 1$ efficiency bounded from below with respect to
estimators of $[E(X^p)]^{1/p}$ for the families

$\mathcal{F}_1$ = {All distributions on the positive axis with
finite second moment}

$\mathcal{F}_2$ = {All members of $\mathcal{F}_1$ with monotone nonincreasing
densities}

$\mathcal{F}_3$ = {All members of $\mathcal{F}_1$ which are scale mixtures of
half normal distributions}.

7.  Unknown  center  of  symmetry 

In estimating dispersion we have so far restricted attention
to symmetric distributions and have assumed the center of symmetry
to be known.  If the center $\mu$ of symmetry is unknown, it is
tempting to estimate $\mu$ by a suitable estimate of location and
to substitute this for $\mu$ in the estimator of $\tau$.  The question
naturally arises what effect this has on the asymptotic
distribution and hence on the efficiency of the estimator.  For the
measures of dispersion considered in this paper it turns out that
under suitable regularity conditions the asymptotic distribution
is unchanged by this substitution.  We have not investigated the
small sample behavior of these procedures.

The theorem below gives a simple sufficient condition for
substitution of an estimate of location to work in this sense.
We subsequently check that this condition is satisfied by trimmed
standard deviations and pth power deviations among others.

To formalize the process of substitution we appeal to the
discussion of Section 6 in which we indicated that there is a 1-1
correspondence between measures of dispersion $\tau$ for symmetric
distributions and measures of scale $\sigma$ for positive random
variables via (6.3).  Let us start then with such a $\sigma$ defined
for nonnegative variables.  For reasons similar to those given
at the beginning of Section 2, we need to consider extensions
of $\sigma$ to larger families of distributions, namely to the family
of all distributions F for which $\sigma(|X|)$ is defined.  To avoid
proliferation of notation we also call the extension $\sigma$ and
define it by,

$\sigma(X) = \sigma(|X|)$.

For example, if $\sigma^2(X) = \int_0^{\infty} x^2\,dF$ for nonnegative variables, the
extension of $\sigma^2$ is the second moment.  If $\sigma(X)$ is the median
of the distribution on $(0, \infty)$, the extension is the median of
the distribution of $|X|$.




Let $\mu(F)$ be a measure of location satisfying (1.1) and (?.,?'}
of BL II.  Suppose also that $X_1, \ldots, X_n$ are independently
distributed each with distribution F and let $\hat F$ be the empirical
c.d.f.  Define,

$\hat\mu = \mu(\hat F)$

as the usual estimator of $\mu(F)$ and let

$F_\mu(x) = F(x + \mu)$

denote the c.d.f. of $X - \mu$.  Note that the empirical c.d.f. of
$X_1 - \mu, \ldots, X_n - \mu$ is then $\hat F_\mu$.

If F is symmetric about $\mu$ the measure of dispersion
corresponding to $\sigma$ by (6.3) is,

$\tau(F) = \sigma(F_\mu)$.

The estimator of $\tau$ we have used for known $\mu$ is $\sigma(\hat F_\mu)$.
Failing this knowledge we use $\hat\tau$ defined by,

$\hat\tau = \sigma(\hat F_{\hat\mu})$.

$ 


For example if

$\mu(F) = \int_{-\infty}^{\infty} x\,dF(x)$

$\sigma^2(F) = \int_0^{\infty} x^2\,dG(x)$

where G is the c.d.f. of $|X|$ then $\hat\mu$ is the sample mean
and $\hat\tau$ is the sample SD.
If

$\mu(F) = F^{-1}(\tfrac{1}{2})$

$\sigma(F) = G^{-1}(\tfrac{1}{2})$

then $\hat\tau$ is the median of the absolute deviations from the median
of the observations.
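
The two examples can be sketched in a few lines.  This is an
illustration only: `tau_hat` and `rms` are hypothetical names, and
the sample SD obtained here is the version with divisor n.

```python
import statistics


def rms(z):
    """Square root of the second moment of a sample."""
    return statistics.fmean([v * v for v in z]) ** 0.5


def tau_hat(x, location, sigma):
    """Substitution estimator: sigma applied to |X_i - mu_hat|."""
    mu_hat = location(x)
    return sigma([abs(v - mu_hat) for v in x])


x = [1.0, 2.0, 3.0, 4.0, 100.0]

# Mean as location, root second moment as scale: sample SD (divisor n).
sd = tau_hat(x, statistics.fmean, rms)

# Median as location and as scale: median absolute deviation from the median.
mad = tau_hat(x, statistics.median, statistics.median)

print(sd, statistics.pstdev(x), mad)
```

The first two printed values agree up to rounding, since
`statistics.pstdev` also uses divisor n.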


Theorem 5.  Suppose that the underlying distribution F
is symmetric about $\mu$ and suppose without loss of generality
that $\mu = 0$.  Suppose further that,

(7.1)  $\sigma(F_\mu)$ is differentiable in $\mu$ at $\mu = 0$,

(7.2)  $\lim_{M \to \infty} \limsup_n P[\sqrt{n}\,|\hat\mu| \ge M] = 0$,

(7.3)  $\lim_{\delta \downarrow 0} \limsup_n P[\sup\{\sqrt{n}\,|\sigma(\hat F_\mu) - \sigma(F_\mu) - \sigma(\hat F) + \sigma(F)| : |\mu| \le \delta\} \ge \varepsilon] = 0$  for all $\varepsilon > 0$.

Then,

(7.4)  $\sqrt{n}\,(\hat\tau - \sigma(\hat F)) \xrightarrow{p} 0$.

If we assume that $\tau(F)$ is positive and

(7.5)  $\sigma(\hat F) \xrightarrow{p} \tau(F)$,

then we can arrive at conclusion (7.4) even if throughout (7.3)
we replace $\sigma$ by $\sigma^p$ where $p > 0$.

Note:  $\hat\mu$ will satisfy (7.2) provided that $\sqrt{n}\,\hat\mu$ has a
limiting normal distribution.


By Slutsky's theorem (7.4) implies that $\sqrt{n}\,(\hat\tau - \sigma(F))$ and
$\sqrt{n}\,(\sigma(\hat F) - \sigma(F))$ have the same limiting distribution.  But since
F is symmetric about 0,

$\sigma(F) = \tau(F)$

and

$\sigma(\hat F) = \tau(\hat F)$,

the estimate we would use if the center of symmetry of F were
known.  Thus, (7.4) implies that $\sqrt{n}\,(\hat\tau - \tau(F))$ and $\sqrt{n}\,(\tau(\hat F) - \tau(F))$
have the same limiting distribution.  That is, if (7.4) holds
substitution works.
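
Conclusion (7.4) can be illustrated for the standard deviation by a
small simulation.  This is a sketch under assumed settings (standard
normal data, true center 0, sample mean as location estimate), not
part of the proof; `substitution_gap` is a hypothetical helper.

```python
import math
import random


def substitution_gap(n, seed=0):
    """sqrt(n) * (tau_hat - sigma(F_hat)) for N(0, 1) data with center 0."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mean = sum(x) / n
    # sigma(F_hat): root second moment about the known center 0.
    known = math.sqrt(sum(v * v for v in x) / n)
    # tau_hat: root second moment about the estimated center.
    estimated = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    return math.sqrt(n) * (estimated - known)


for n in (100, 10_000):
    print(n, substitution_gap(n))
```

In this case the gap is never positive, because the second moment
about the sample mean cannot exceed the second moment about 0, and
it is of order $1/\sqrt{n}$ in probability.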

Proof.  From (7.2) and (7.3),

(7.6)  $\sqrt{n}\,[\hat\tau - \sigma(F_{\hat\mu}) - \sigma(\hat F) + \sigma(F)] \xrightarrow{p} 0$.

Since $\hat\mu \xrightarrow{p} 0$ by (7.2) we can apply (7.1) to conclude that

(7.7)  $\sqrt{n}\,\{(\sigma(F_{\hat\mu}) - \sigma(F)) - \hat\mu\,[\partial\sigma(F_\mu)/\partial\mu]_{\mu=0}\} \xrightarrow{p} 0$.

If F is symmetric about 0, it follows from (1.4) and (1.6) that
$\sigma(F_\mu)$ is an even function of $\mu$ and hence that
$[\partial\sigma(F_\mu)/\partial\mu]_{\mu=0} = 0$.
Substituting in (7.7) and (7.6) completes the proof of (7.4).

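If now $\sigma$ is replaced by $\sigma^p$ in (7.3) we can imitate the
proof of (7.4) exactly to conclude that,

(7.8)  $\sqrt{n}\,(\sigma^p(\hat F_{\hat\mu}) - \sigma^p(\hat F)) \xrightarrow{p} 0$.

But,

$\sqrt{n}\,(\hat\tau - \sigma(\hat F)) = \rho\,[(\sigma^*)^p]^{\rho-1}\,\sqrt{n}\,(\sigma^p(\hat F_{\hat\mu}) - \sigma^p(\hat F))$

where $\rho = 1/p$ and $\sigma^*$ lies between $\sigma(\hat F_{\hat\mu})$ and $\sigma(\hat F)$.  By (7.5)
and (7.8)

$\sigma^* \xrightarrow{p} \sigma(F) > 0$

and (7.4) follows.  □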


Examples

(i)  The pth power deviations

Define $\tau(H) = (\int_{-\infty}^{\infty} |x|^p\,dH(x))^{1/p}$ for all H such that
$\int |x|^p\,dH(x) < \infty$.  Then $\tau(F_\mu)$ is the pth power deviation.  We shall
check that (7.3) holds for $\tau^p$ under the condition


(7.9)  $\int |x|^{2p}\,dF(x) < \infty$.


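For $\mu < \nu$, we find

$E(\sqrt{n}\,(\tau^p(\hat F_\nu) - \tau^p(\hat F_\mu) - \tau^p(F_\nu) + \tau^p(F_\mu)))^2 = \mathrm{Var}(|X_1 - \nu|^p - |X_1 - \mu|^p)$.

If $p > 1$ this variance is bounded by,

$E(|X_1 - \nu|^p - |X_1 - \mu|^p)^2 \le p^2(\nu - \mu)^2 \sup\{E|X_1 - \lambda|^{2p-2} : \lambda \in [\mu, \nu]\} \le C(\nu - \mu)^2$  if $|\nu|, |\mu| \le M$,

where C is a constant depending on M.  If $p = 1$,

$\mathrm{Var}(|X_1 - \nu| - |X_1 - \mu|) \le E[(\nu - \mu)(2 I[X_1 \le \mu] - 1) + 2(\nu - X_1) I[\mu < X_1 < \nu]]^2 \le C(\nu - \mu)^2$  if $|\nu|, |\mu| \le M$.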
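
The representation used for p = 1 can be checked pointwise.  The
sketch below is an illustration only, with an arbitrary assumed pair
mu < nu and grid.  It compares $|x - \nu| - |x - \mu|$ with the expression
inside the expectation, and also checks that its absolute value never
exceeds $\nu - \mu$ on the grid, which is consistent with a bound of the
form $C(\nu - \mu)^2$.

```python
def lhs(x, mu, nu):
    """|x - nu| - |x - mu|."""
    return abs(x - nu) - abs(x - mu)


def rhs(x, mu, nu):
    """(nu - mu)(2 I[x <= mu] - 1) + 2 (nu - x) I[mu < x < nu]."""
    below = 1.0 if x <= mu else 0.0
    between = 1.0 if mu < x < nu else 0.0
    return (nu - mu) * (2.0 * below - 1.0) + 2.0 * (nu - x) * between


mu, nu = -0.5, 1.25                      # arbitrary values with mu < nu
grid = [i / 8 for i in range(-40, 41)]   # dyadic points, exact in floating point

max_gap = max(abs(lhs(x, mu, nu) - rhs(x, mu, nu)) for x in grid)
bounded = all(abs(lhs(x, mu, nu)) <= nu - mu for x in grid)
print(max_gap, bounded)
```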

In any event,

$E(\sqrt{n}\,(\tau^p(\hat F_\nu) - \tau^p(\hat F_\mu) - \tau^p(F_\nu) + \tau^p(F_\mu)))^2 \le C(\nu - \mu)^2$

and an argument using Theorem 12.3, p. 95 in [3] completes the
proof.  Since (7.9) implies (7.5) as well as the asymptotic normality
of $\tau(\hat F)$ we see that any measure $\mu$ satisfying (7.2), for
instance the mean, can be substituted without affecting the asymptotic
behavior of these estimators.
(ii)  (L) estimators

Suppose $\tau$ is given by

(7.10)  $\tau(H) = [\int_0^1 |H^{-1}(t)|^\gamma\,d\Lambda(t)]^{1/\gamma}$

where $\Lambda$ is symmetric about $\tfrac{1}{2}$.

If F is symmetric about $\mu$, and if we define $\tilde\Lambda(t) =
2\Lambda(\tfrac{1+t}{2}) - 1$, then $\tau(F_\mu)$ is the measure defined in (1.10).
Suppose that

(7.11)  $\Lambda$ places no mass outside an interval $(\alpha, 1 - \alpha)$,
$0 < \alpha < 1/2$,

(7.12)  F is differentiable with derivative f which is
positive and continuous.

We shall now show that (7.3) holds for $\tau^\gamma$ under these
assumptions, so that we may safely substitute an estimate of
location in the singly and doubly trimmed standard deviations.
Here is a sketch proof.

Let

$G(x, \mu) = P[\,|X_1 - \mu|^\gamma \le x\,]$,

and let $\hat G(x, \mu)$ be the empirical c.d.f. of $|X_1 - \mu|^\gamma, \ldots, |X_n - \mu|^\gamma$.
Define

$G(x) = G(x, 0)$,  $\hat G(x) = \hat G(x, 0)$.

Finally let $G^{-1}(t, \mu)$, $\hat G^{-1}(t, \mu)$ be the corresponding
inverses (in x).
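
For a finite sample these objects are easy to write down.  The
sketch below is illustrative only: the function names are
hypothetical, the inverse is taken to be the left-continuous one,
and `trimmed_deviation` assumes uniform weights on the retained
order statistics, which need not match the normalization of (1.10).

```python
import math


def g_hat(x, mu, sample, gamma=2.0):
    """Empirical c.d.f. of |X_i - mu|**gamma evaluated at x."""
    values = [abs(v - mu) ** gamma for v in sample]
    return sum(1 for v in values if v <= x) / len(values)


def g_hat_inverse(t, mu, sample, gamma=2.0):
    """Smallest order statistic of |X_i - mu|**gamma with g_hat >= t."""
    values = sorted(abs(v - mu) ** gamma for v in sample)
    index = max(math.ceil(t * len(values)) - 1, 0)
    return values[index]


def trimmed_deviation(sample, mu, alpha, gamma=2.0):
    """Average the n - floor(alpha * n) smallest values of |X_i - mu|**gamma
    (uniform weights, an assumption) and take the 1/gamma power."""
    values = sorted(abs(v - mu) ** gamma for v in sample)
    keep = len(values) - math.floor(alpha * len(values))
    return (sum(values[:keep]) / keep) ** (1.0 / gamma)


sample = [0.0, 1.0, 2.0, 3.0, 100.0]
mu_hat = 2.0  # the sample median, substituted for the unknown center
print(g_hat(1.0, mu_hat, sample), g_hat_inverse(0.5, mu_hat, sample))
print(trimmed_deviation(sample, mu_hat, alpha=0.2))
```

With alpha = 0.2 the largest squared deviation, which comes from the
observation 100, is discarded before averaging.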




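Note that

$\tau^\gamma(F_\mu) = 2\int_0^1 G^{-1}(t, \mu)\,d\Lambda(\tfrac{1+t}{2}) = \int_0^1 G^{-1}(t, \mu)\,d\tilde\Lambda(t)$,

and similar assertions hold for $\hat F_\mu$, $\hat F$, etc.
Then,

(7.13)  $|\tau^\gamma(\hat F_\mu) - \tau^\gamma(F_\mu) - \tau^\gamma(\hat F) + \tau^\gamma(F)|$

$= |\int_0^1 (\hat G^{-1}(t, \mu) - \hat G^{-1}(t) - G^{-1}(t, \mu) + G^{-1}(t))\,d\tilde\Lambda(t)|$

$\le \sup\{|\hat G^{-1}(t, \mu) - \hat G^{-1}(t) - G^{-1}(t, \mu) + G^{-1}(t)| : 0 \le t \le 1 - \alpha\}$.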


Therefore, we need only establish that

(7.14)  $\lim_{\delta \downarrow 0} \limsup_n P[\sup\{\sqrt{n}\,|\hat G^{-1}(t, \mu) - \hat G^{-1}(t) - G^{-1}(t, \mu) + G^{-1}(t)| : 0 \le t \le 1 - \alpha,\ |\mu| \le \delta\} \ge \varepsilon] = 0$.

By a standard but tedious argument (see [1] and [8]), we can
show that (7.14) follows if, for every M

(7.15)  $\sup\{|\hat G^{-1}(t, \mu) - G^{-1}(t, \mu)| : 0 \le t \le 1 - \alpha,\ |\mu| \le M\} \xrightarrow{p} 0$

and

(7.16)  $j(t, \mu) = \frac{\partial G^{-1}}{\partial t}(t, \mu)$ is continuous on
$\{(t, \mu) : 0 \le t \le 1 - \alpha,\ |\mu| \le M\}$


and  if  in  addition 

(7.17)  liin6jo  lim  suPn  P[GUP'[ 


• Vn 


A 


X.d)  - G(x,m) 


j(G(x,  n)  , n) 


= 0, 


Now, (7.15) follows from the fact that for every M and $M_1$

(7.18)  $\sup\{|\hat G(x, \mu) - G(x, \mu)| : x \le M_1,\ |\mu| \le M\} \to 0$.

This assertion is a consequence of the Glivenko-Cantelli theorem
and the definition of G, $\hat G$.

Since

$j(t, \mu) = \frac{1}{g(G^{-1}(t, \mu), \mu)} = [f(G^{-1}(t, \mu) + \mu) + f(-G^{-1}(t, \mu) + \mu)]^{-1}$,

assertion (7.16) is immediate from assumption.  Finally, (7.17)
follows from (7.16), the tightness of the empirical process

$\lim_{\delta \downarrow 0} \limsup_n P[\sup\{\sqrt{n}\,|\hat F(x + \mu) - F(x + \mu) - \hat F(x) + F(x)| : |\mu| \le \delta,\ |x| \le M\} \ge \varepsilon] = 0$,

and the boundedness (in probability) of $\sup_x \sqrt{n}\,|\hat G(x) - G(x)|$.
Therefore, (7.3) holds for $\tau^\gamma$ with $\tau$ defined by (7.10).  As before (7.1)
and (7.5) are usually obvious under our assumptions and we conclude
that substitution of location estimates satisfying (7.2) such as
the median is legitimate.